|
I wonder how hard it would be to store true 32-bit pointers in the const-eval allocation for the vtable. That would avoid all the hacks elsewhere around the size mismatch between const-eval and runtime. |
|
☔ The latest upstream changes (presumably #147475) made this pull request unmergeable. Please resolve the merge conflicts. |
This is a WIP patch for implementing rust-lang/compiler-team#903. It adds a new unstable flag `-Zexperimental-relative-rust-abi-vtables` that makes vtables PIC-friendly. This is only supported for LLVM codegen, not for other backends.
Early feedback on this is welcome. I'm not sure the way I implemented it is the best approach, since much of the actual vtable emission happens during LLVM codegen. That is, to MIR the vtable looks like a normal table of pointers and byte arrays, and I only make the vtables relative at the codegen level.
Locally, I can build the stage 1 compiler and runtimes with relative vtables, but I couldn't figure out how to tell the build system to build only stage 1 binaries with this flag, so I work around this by unconditionally enabling relative vtables in rustc. The end goal, I think, is either something akin to multilibs in clang, where the compiler chooses which runtimes to use based on compilation flags, or binding this ABI to the target so that it is part of the default ABI for that target (just like how relative vtables are the default for Fuchsia in C++ with Clang). I think the latter is what target modifiers do (rust-lang#136966).
Action Items:
- I'm still experimenting with building Fuchsia with this to confirm it works end to end, and I still need to take some measurements to see whether this is worth pursuing.
- More work is still needed to ensure the correct relative intrinsics are emitted with CFI and LTO. Right now I'm experimenting on a normal build.
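To make "PIC-friendly" concrete, here is a minimal sketch of the general idea behind relative vtable entries: each slot stores a signed 32-bit distance from the table to the target instead of an absolute address. It is a simulation over plain integers, not the code this patch emits. The helper names `encode_relative` and `decode_relative`, the choice of the table base as the reference point, and all addresses are assumptions made up for illustration.

```rust
/// Encode `target` as a signed 32-bit offset from `table_base`.
/// Returns `None` when the distance does not fit in 32 bits.
/// (Hypothetical helper, not part of rustc.)
fn encode_relative(table_base: u64, target: u64) -> Option<i32> {
    let delta = target.wrapping_sub(table_base) as i64;
    if delta >= i32::MIN as i64 && delta <= i32::MAX as i64 {
        Some(delta as i32)
    } else {
        None
    }
}

/// Recover the absolute address from a stored offset.
/// (Hypothetical helper, not part of rustc.)
fn decode_relative(table_base: u64, offset: i32) -> u64 {
    // Sign-extend the offset, then add with wraparound so negative offsets work.
    table_base.wrapping_add(offset as i64 as u64)
}

fn main() {
    // Made-up link-time addresses for a table and two functions it points to.
    let table_base: u64 = 0x5000_1000;
    let targets: [u64; 2] = [0x5000_0040, 0x5000_2000];

    let offsets: Vec<i32> = targets
        .iter()
        .map(|&t| encode_relative(table_base, t).expect("target out of i32 range"))
        .collect();

    // Simulate loading the image at a different base address: the table and
    // the functions move together, so the stored offsets stay valid unchanged.
    let load_bias: u64 = 0x7f00_0000_0000;
    for (i, &off) in offsets.iter().enumerate() {
        let resolved = decode_relative(table_base + load_bias, off);
        assert_eq!(resolved, targets[i] + load_bias);
    }
    println!("offsets = {:?}", offsets);
}
```

Because the stored values do not depend on where the image is loaded, a table like this needs no load-time fixups, which is the property the flag is after.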
|
☔ The latest upstream changes (presumably #152934) made this pull request unmergeable. Please resolve the merge conflicts. |
|
I'm testing this patch on an assortment of my crates, including some vtable-heavy ones. It reduces binary size by 1% to 5%, mainly from cutting down dynamic relocations ( However, I hit some segfaults at runtime due to a vtable layout mismatch between const-eval and runtime (as mentioned above). That is,
fn main() {
    const X: &dyn std::fmt::Display = &42i32; // const-eval builds the vtable with absolute fn pointers
    println!("{X}"); // runtime assumes the vtable is relative, oops
}
I also got a weird compile error with no further information when compiling rust-analyzer |
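The failure mode described above can be sketched as a simulation: a table written with absolute 64-bit entries is read by code that expects 32-bit relative slots. This is an illustration under assumptions, not rustc's actual vtable layout. The helper names `write_absolute_table` and `read_as_relative`, the slot widths, the little-endian encoding, and the addresses are all hypothetical.

```rust
/// Writer side (stands in for const-eval here): 8-byte absolute addresses.
/// (Hypothetical helper; little-endian chosen for illustration.)
fn write_absolute_table(entries: &[u64]) -> Vec<u8> {
    let mut bytes = Vec::with_capacity(entries.len() * 8);
    for entry in entries {
        bytes.extend_from_slice(&entry.to_le_bytes());
    }
    bytes
}

/// Reader side (stands in for runtime code here): treats the same bytes as
/// 4-byte signed offsets relative to the table's own address.
fn read_as_relative(table: &[u8], table_base: u64, slot: usize) -> u64 {
    let s = slot * 4;
    let raw = i32::from_le_bytes([table[s], table[s + 1], table[s + 2], table[s + 3]]);
    table_base.wrapping_add(raw as i64 as u64)
}

fn main() {
    // Made-up address of the function the slot is supposed to reach.
    let fmt_fn: u64 = 0x0000_5555_0000_2000;
    let table = write_absolute_table(&[fmt_fn]);

    // Made-up address at which the table itself lives.
    let table_base: u64 = 0x0000_5555_0001_0000;
    let decoded = read_as_relative(&table, table_base, 0);

    // The reader lands somewhere other than the intended function, which in a
    // real program would be a jump to an arbitrary address.
    assert_ne!(decoded, fmt_fn);
    println!("expected {:#x}, decoded {:#x}", fmt_fn, decoded);
}
```

The point of the sketch is only that both sides must agree on slot width and encoding; either side being correct in isolation is not enough.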
My colleague Erick has a more up-to-date version of this at main...erickt:rust:relative-vtables, which should include support for building the runtimes and (hopefully) has fixes for the merge conflicts I didn't have time to address here, so you might have more luck trying that out. (Fair warning: some of those updates were vibe-coded, but they do seem to get rustc and runtime tests passing, and we can build a bunch of downstream Rust projects with it.) I'll eventually come back and clean this PR up, but we're still trying to collect some numbers on the side. It could be that we missed a few cases, though. If there are any runtime assumptions about the vtable ABI, those will need to be changed as well. The same goes for const-eval, which I'm not sure I tackled in my initial patch. |
|
@oxalica - thanks for trying it out! Which crates are you testing it with? As @PiJoules said, we've got this patch passing the Rust test suite and working with servo, tokio, ripgrep, and chrome, and it is showing savings between 0.25% and roughly 4%. I just need to get some performance numbers before resuming talks with the compiler team. I'd be happy to see if I can reproduce the segfaults. |
|
Thanks both of you for the work!
A public one is
I'm testing this PR rebased onto 99246f4, which is the latest non-conflicting commit. It may be a bit out of date. If there are more updates (ones that fix the merge conflicts), it would be good to push them to this PR to make testing easier. The crash happens on
To me the runtime cost of vtable calls does not matter much. Vtable calls are already assumed to be slow because the call cannot be inlined and the branch may be mispredicted, so one more add instruction costing less than a cycle is negligible. My main goal is to reduce code size and startup cost. Absolute relocations increase the work done during dynamic linking before main and reduce memory sharing (more data in process-private |